Abstract


We sequenced the genomes of a ~7,000 year old farmer from Germany and eight ~8,000 year old
hunter-gatherers from Luxembourg and Sweden. We analyzed these and other ancient genomes
with 2,345 contemporary humans to show that most present Europeans derive from at least three
highly differentiated populations: West European Hunter-Gatherers (WHG), who contributed
ancestry to all Europeans but not to Near Easterners; Ancient North Eurasians (ANE) related to
Upper Paleolithic Siberians, who contributed to both Europeans and Near Easterners; and Early
European Farmers (EEF), who were mainly of Near Eastern origin but also harbored WHG-related
ancestry. We model these populations’ deep relationships and show that EEF had ~44% ancestry
from a “Basal Eurasian” population that split prior to the diversification of other non-African
lineages.


Near Eastern migrants played a major role in the introduction of agriculture to Europe, as
ancient DNA indicates that early European farmers were distinct from European
hunter-gatherers and close to present-day Near Easterners. However, modelling present-day
Europeans as a mixture of these two ancestral populations does not account for the fact that
they are also admixed with a population related to Native Americans. To clarify the
prehistory of Europe, we sequenced nine ancient genomes (Fig. 1A; Extended Data Fig. 1):
“Stuttgart” (19-fold coverage), a ~7,000 year old skeleton found in Germany in the context
of artifacts from the first widespread farming culture of central Europe, the
Linearbandkeramik; “Loschbour” (22-fold), an ~8,000 year old skeleton from the
Loschbour rock shelter in Luxembourg, discovered in the context of hunter-gatherer artifacts
(SI1; SI2); and seven ~8,000 year old samples (0.01–2.4-fold) from a hunter-gatherer burial
in Motala, Sweden (the highest coverage individual was “Motala12”).


Sequence reads from all samples revealed >20% C→T and G→A deamination-derived
mismatches at the ends of the molecules that are characteristic of ancient DNA (SI3). We
estimate nuclear contamination rates to be 0.3% for Stuttgart and 0.4% for Loschbour (SI3),
and mitochondrial (mtDNA) contamination rates to be 0.3% for Stuttgart, 0.4% for
Loschbour, and 0.01–5% for the Motala individuals (SI3). Stuttgart has mtDNA haplogroup
T2, typical of Neolithic Europeans, and Loschbour and all Motala individuals have the U5
or U2 haplogroups, typical of hunter-gatherers (SI4). Stuttgart is female, while Loschbour
and five Motala individuals are male (SI5) and belong to Y-chromosome haplogroup I,
suggesting that this was common in pre-agricultural Europeans (SI5).


We carried out large-scale sequencing of libraries prepared with uracil DNA glycosylase
(UDG), which removes deaminated cytosines, thus reducing errors arising from ancient
DNA damage (SI3). The ancient individuals had indistinguishable levels of Neanderthal
ancestry when compared to each other (~2%) and to present-day Eurasians (SI6). The
heterozygosity of Stuttgart (0.00074) is at the high end of present-day Europeans, while that
of Loschbour (0.00048) is lower than in any present humans (SI2), reflecting a strong
bottleneck in Loschbour’s ancestors, as the genetic data show that he was not recently inbred
(Extended Data Fig. 2). High copy numbers for the salivary amylase gene (AMY1) have
been associated with a high starch diet; our data are consistent with this finding in that the
ancient hunter-gatherers La Braña (from Iberia), Motala12, and Loschbour had 5, 6 and 13
copies respectively, whereas the Stuttgart farmer had 16 (SI7). Both Loschbour and Stuttgart
had dark hair (>99% probability); and Loschbour, like La Braña and Motala12, likely had
blue or intermediate-colored eyes (>75%) while Stuttgart likely had brown eyes (>99%)
(SI8). Neither Loschbour nor La Braña carries the skin-lightening allele in SLC24A5 that is
homozygous in Stuttgart and nearly fixed in Europeans today, but Motala12 carries at least
one copy of the derived allele, showing that this allele was present in Europe prior to the
advent of agriculture.


We compared the ancient genomes to 2,345 present-day humans from 203 populations
genotyped at 594,924 autosomal single nucleotide polymorphisms (SNPs) with the Human
Origins array (SI9) (Extended Data Table 1). We used ADMIXTURE to identify 59
“West Eurasian” populations that cluster with Europe and the Near East (SI9 and Extended
Data Fig. 3). Principal component analysis (PCA) (SI10) (Fig. 1B) indicates a
discontinuity between the Near East and Europe, with each showing north-south clines
bridged only by a few populations of mainly Mediterranean origin. We projected the
newly sequenced and previously published ancient genomes onto the first two principal
components (PCs) (Fig. 1B). Upper Paleolithic hunter-gatherers from Siberia like the MA1
(Mal’ta) individual project at the northern end of the PCA, suggesting an “Ancient North
Eurasian” meta-population (ANE). European hunter-gatherers from Spain, Luxembourg,
and Sweden fall beyond present-day Europeans in the direction of European differentiation


Nature. Author manuscript; available in PMC 2015 March 18. 




Lazaridis et al. 




from the Near East, and form a “West European Hunter-Gatherer” (WHG) cluster including
Loschbour and La Braña, and a “Scandinavian Hunter-Gatherer” (SHG) cluster including
the Motala individuals and ~5,000 year old hunter-gatherers from the Pitted Ware Culture.
An “Early European Farmer” (EEF) cluster includes Stuttgart, the ~5,300 year old Tyrolean
Iceman and a ~5,000 year old Swedish farmer.

Patterns observed in PCA may be affected by sample composition (SI10) and their
interpretation in terms of admixture events is not straightforward, so we rely on formal
analysis of f-statistics to document mixture of at least three source populations in the
ancestry of present Europeans. We began by computing all possible statistics of the form
f3(Test; Ref1, Ref2) (SI11), which if significantly negative show unambiguously that Test is
admixed between populations anciently related to Ref1 and Ref2 (we choose Ref1 and Ref2
from 5 ancient and 192 present populations). The lowest f3-statistics for Europeans are
negative (93% are >4 standard errors below 0), with most showing strong support for at least
one ancient individual being one of the references (SI11). Europeans almost always have
their lowest f3 with either (EEF, ANE) or (WHG, Near East) (SI11, Table 1, Extended Data
Table 1), which would not be expected if there were just two ancient sources of ancestry (in
which case the best references for all Europeans would be similar). The lowest f3-statistic for
Near Easterners always takes Stuttgart as one of the reference populations, consistent with a
Near Eastern origin for Stuttgart’s ancestors (Table 1). We also computed the statistic
f4(Test, Stuttgart; MA1, Chimp), which measures whether MA1 shares more alleles with a
Test population or with Stuttgart. This statistic is significantly positive (Extended Data Fig.
4, Extended Data Table 1) if Test is nearly any present-day West Eurasian population,
showing that MA1-related ancestry has increased since the time of early farmers like
Stuttgart (the analogous statistic using Native Americans instead of MA1 is correlated but
smaller in magnitude (Extended Data Fig. 5), indicating that MA1 is a better surrogate than
the Native Americans who were first used to document ANE ancestry in Europe). The
analogous statistic f4(Test, Stuttgart; Loschbour, Chimp) is nearly always positive in
Europeans and negative in Near Easterners, indicating that Europeans have more ancestry
from populations related to Loschbour than do Near Easterners (Extended Data Fig. 4,
Extended Data Table 1). Extended Data Table 2 documents the robustness of key
f4-statistics by recomputing them using transversion polymorphisms not affected by ancient
DNA damage, and also using whole-genome sequencing data not affected by SNP
ascertainment bias. Extended Data Fig. 6 shows the geographic gradients in the degree of
allele sharing of present-day West Eurasians (as measured by f4-statistics) with Stuttgart
(EEF), Loschbour (WHG) and MA1 (ANE).


To determine the minimum number of source populations needed to explain the data for
many European populations taken together, we studied the matrix of all possible statistics of
the form f4(Test_base, Test_i; O_base, O_j) (SI12). Test_base is a reference European population,
Test_i is the set of all other European Test populations, O_base is a reference outgroup, and O_j
is the set of other outgroups (ancient DNA samples, Onge, Karitiana, and Mbuti). The rank
of the (i, j) matrix reflects the minimum number of sources that contributed to the Test
populations (refs 16, 17). For a pool of individuals from 23 Test populations representing most
present-day European groups, this analysis rejects descent from just two sources (P<10^-12
by a Hotelling T-test). However, three source populations are consistent with the data after
excluding the Spanish who have evidence for African admixture (P=0.019, not
significant after multiple-hypothesis correction), consistent with the results from
ADMIXTURE (SI9), PCA (Fig. 1B, SI10) and f-statistics (Extended Data Table 1, Extended
Data Fig. 6, SI11, SI12). We caution that the finding of three sources could be consistent
with a larger number of mixture events. Moreover, the source populations may themselves
have been mixed. Indeed, the positive f4(Stuttgart, Test; Loschbour, Chimp) statistics
obtained when Test is Near Eastern (Extended Data Table 1) imply that the EEF had some
WHG-related ancestry, which was greater than 0% and could be as high as 45% (SI13).


We used the ADMIXTUREGRAPH software to fit a model (a tree structure augmented
by admixture events) to the data, exploring models relating the three ancient populations
(Stuttgart, Loschbour, and MA1) to two eastern non-Africans (Onge and Karitiana) and
sub-Saharan Africans (Mbuti). We found no models that fit the data with 0 or 1 admixture
events, but did find a model that fit with 2 admixture events (SI14). The successful model
(Fig. 2A) confirms the existence of MA1-related admixture in Native Americans, but
includes the novel inference that Stuttgart is partially (44 ± 10%) derived from a lineage that
split prior to the separation of eastern non-Africans from the common ancestor of WHG and
ANE. The existence of such “Basal Eurasian” admixture into Stuttgart provides a simple
explanation for our finding that diverse eastern non-African populations share significantly
more alleles with ancient European and Upper Paleolithic Siberian hunter-gatherers than
with Stuttgart (that is, f4(Eastern non-African, Chimp; Hunter-gatherer, Stuttgart) is
significantly positive), but that hunter-gatherers appear to be equally related to most eastern
groups (SI14). We verified the robustness of the model by reanalyzing the data using the
unsupervised MixMapper (SI15) and TreeMix software (SI16), which both identified the
same admixture events. The ANE/WHG split must have occurred >24,000 years ago (as it
must predate the age of MA1), and the WHG/Eastern non-African split must have occurred
>40,000 years ago (as it must predate the Tianyuan individual from China which clusters
with Asians to the exclusion of Europeans). The Basal Eurasian split must be even older,
and might be related to early settlement of the Levant or Arabia prior to the
diversification of most Eurasians, or more recent gene flow from Africa. However, the
Basal Eurasian population shares much of the genetic drift common to non-African
populations after their separation from Africans, and thus does not appear to represent gene
flow between sub-Saharan Africans and the ancestors of non-Africans after the out-of-Africa
bottleneck (SI14).


Fitting present-day Europeans into the model, we find that few populations can be fit as
2-way mixtures, but nearly all are compatible with 3-way mixtures of ANE/EEF/WHG (SI14).
The mixture proportions from the fitted model (Fig. 2B; Extended Data Table 3) are
encouragingly consistent with those obtained from a separate method that relates European
populations to diverse outgroups using f4-statistics, assuming only that MA1 is an unmixed
descendant of ANE, Loschbour of WHG, and Stuttgart of EEF (SI17). We infer that EEF
ancestry in Europe today ranges from ~30% in the Baltic region to ~90% in the
Mediterranean, consistent with patterns of identity-by-descent (IBD) sharing (SI18) and
shared haplotype analysis (chromosome painting) (SI19) in which Loschbour shares more
segments with northern Europeans and Stuttgart with southern Europeans. Southern
Europeans inherited their European hunter-gatherer ancestry mostly via EEF ancestors
(Extended Data Fig. 6), while Northern Europeans acquired up to 50% of WHG ancestry
above and beyond the WHG-related ancestry which they received through their EEF
ancestors. Europeans have a larger proportion of WHG than ANE ancestry in general. By
contrast, in the Near East there is no detectable WHG ancestry, but up to ~29% ANE in the
Caucasus (SI14). A striking feature of these findings is that ANE ancestry is inferred to be
present in nearly all Europeans today (with a maximum of ~20%), but was absent in both
farmers and hunter-gatherers from central/western Europe during the Neolithic transition. At
the same time, we infer that ANE ancestry was not completely absent from the larger
European region at that time: we find that it was present in ~8,000 year old Scandinavian
hunter-gatherers, since MA1 shares more alleles with Motala12 (SHG) than with Loschbour,
and Motala12 fits as a mixture of 81% WHG and 19% ANE (SI14).


Two sets of European populations are poor fits for the model. Sicilians, Maltese, and
Ashkenazi Jews have EEF estimates of >100%, consistent with their having more Near
Eastern ancestry than can be explained via EEF admixture (SI17). They also cannot be
jointly fit with other Europeans (SI14), and they fall in the gap between Europeans and Near
Easterners (Fig. 1B). Finns, Mordovians and Russians (from the northwest of Russia) also
do not fit (SI14; Extended Data Table 3) due to East Eurasian gene flow into the ancestors of
these northeastern European populations. These populations (and Chuvash and Saami) are
more related to East Asians than can be explained by ANE admixture (Extended Data Fig.
7), likely reflecting a separate stream of Siberian gene flow into northeastern Europe (SI14).


Several questions will be important to address in future ancient DNA work. Where and
when did the Near Eastern farmers admix with European hunter-gatherers to produce the
EEF? How did the ancestors of present-day Europeans first acquire their ANE ancestry?
Discontinuity in central Europe during the late Neolithic (~4,500 years ago) associated with
the appearance of mtDNA types absent in earlier farmers and hunter-gatherers (ref. 30) raises
the possibility that ANE ancestry may have also appeared at this time. Finally, it is important to
study ancient genome sequences from the Near East to provide insights into the history of
the Basal Eurasians.


Online Methods 


Archeological context, sampling and DNA extraction 


The Loschbour sample stems from a male skeleton excavated in 1935 at the Loschbour rock
shelter in Heffingen, Luxembourg. The skeleton was AMS radiocarbon dated to 7,205 ± 50
years before present (OxA-7738; 6,220-5,990 cal BC). At the Palaeogenetics Laboratory
in Mainz, material for DNA extraction was sampled from tooth 16 (an upper right M1
molar) after irradiation with UV-light, surface removal, and pulverization in a mixer mill.
DNA extraction took place in the palaeogenetics facilities in the Institute for Archaeological
Sciences at the University of Tübingen. Three extracts were made in total, one from 80 mg
of powder using an established silica-based protocol and two additional extracts from 90
mg of powder each with a protocol optimized for the recovery of short DNA molecules.


The Stuttgart sample was taken from a female skeleton excavated in 1982 at the site
Viesenhäuser Hof, Stuttgart-Mühlhausen, Germany. It was attributed to the
Linearbandkeramik (5,500-4,800 BC) through associated pottery artifacts and the
chronology was corroborated by radiocarbon dating of the stratigraphy. Both sampling and
DNA extraction took place in the Institute for Archaeological Sciences at the University of
Tübingen. Tooth 47 (a lower right M2 molar) was removed and material from the inner part
was sampled with a sterile dentistry drill. An extract was made using 40 mg of bone
powder.


The Motala individuals were recovered from the site of Kanaljorden in the town of Motala,
Östergötland, Sweden, excavated between 2009 and 2013. The human remains at this site
are represented by several adult skulls and one infant skeleton. All individuals are part of a
ritual deposition at the bottom of a small lake. Direct radiocarbon dates on the remains range
between 7,013 ± 76 and 6,701 ± 64 BP (6,361-5,516 cal BC), corresponding to the late
Middle Mesolithic of Scandinavia. Samples were taken from the teeth of the nine best
preserved skulls, as well as a femur and tibia. Bone powder was removed from the inner
parts of the teeth or bones with a sterile dentistry drill. DNA from 100 mg of bone powder
was extracted in the ancient DNA laboratory of the Archaeological Research Laboratory,
Stockholm.


Library preparation 


Illumina sequencing libraries were prepared using either double- or single-stranded library
preparation protocols (SI1). For high-coverage shotgun sequencing libraries, a DNA
repair step with Uracil-DNA-glycosylase (UDG) and endonuclease VIII (endo VIII)
treatment was included in order to remove uracil residues. Size fractionation on a PAGE
gel was also performed in order to remove longer DNA molecules that are more likely to be
contaminants. Positive and blank controls were carried along during every step of library
preparation.


Shotgun sequencing and read processing 


All non-UDG-treated libraries were sequenced either on an Illumina Genome Analyzer IIx
with 2x76 + 7 cycles for the Loschbour and Motala libraries, or on an Illumina MiSeq with
2x150 + 8 + 8 cycles for the Stuttgart library. We followed the manufacturer’s protocol for
multiplex sequencing. Raw overlapping forward and reverse reads were merged and filtered
for quality and mapped to the human reference genome (hg19/GRCh37/1000Genomes)
using the Burrows-Wheeler Aligner (BWA) (SI2). For deeper sequencing, UDG-treated
libraries of Loschbour were sequenced on 3 Illumina HiSeq 2000 lanes with 50-bp
single-end reads, 8 Illumina HiSeq 2000 lanes of 100-bp paired-end reads and 8 Illumina HiSeq
2500 lanes of 101-bp paired-end reads. The UDG-treated library for Stuttgart was sequenced
on 8 HiSeq 2000 lanes of 101-bp paired-end reads. The UDG-treated libraries for Motala
were sequenced on 8 HiSeq 2000 lanes of 100-bp paired-end reads, with 4 lanes each for
two pools (one of 3 individuals and one of 4 individuals). We also sequenced an additional 8
HiSeq 2000 lanes for Motala12, the Motala sample with the highest percentage of
endogenous human DNA. For the Loschbour and Stuttgart high coverage individuals,
diploid genotype calls were obtained using the Genome Analysis Toolkit (GATK).


Enrichment of mitochondrial DNA and sequencing


To test for DNA preservation and mtDNA contamination, non-UDG-treated libraries of
Loschbour and all Motala samples were enriched for human mitochondrial DNA using a
bead-based capture approach with present-day human DNA as bait. UDG-treatment was
omitted in order to allow characterization of damage patterns typical for ancient DNA.
The captured libraries were sequenced on an Illumina Genome Analyzer IIx platform with
2x76 + 7 cycles and the resulting reads were merged and quality filtered. The sequences
were mapped to the Reconstructed Sapiens Reference Sequence, RSRS, using a custom
iterative mapping assembler, MIA (SI4).


Contamination estimates 


We assessed if the sequences had the characteristics of authentic ancient DNA using four
approaches. First, we searched for evidence of contamination by determining whether the
sequences mapping to the mitochondrial genome were consistent with deriving from more
than one individual. Second, for the high-coverage Loschbour and Stuttgart genomes,
we used a maximum-likelihood-based estimate of autosomal contamination that uses
variation at sites that are fixed in the 1000 Genomes data to estimate error, heterozygosity
and contamination simultaneously. Third, we estimated contamination based on the rate of
polymorphic sites on the X chromosome of the male Loschbour individual (SI3). Fourth,
we analyzed non-UDG treated reads mapping to the RSRS to search for aDNA-typical
damage patterns resulting in C→T changes at the 5’-end of the molecule (SI3).
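The fourth check, looking for elevated C→T changes near the 5’-end, can be illustrated with a short sketch. This is not the authors' pipeline: the function name `c_to_t_by_position` and its input format (gap-free reference/read string pairs in read orientation) are assumptions made for illustration only.

```python
from typing import Iterable, List, Tuple


def c_to_t_by_position(pairs: Iterable[Tuple[str, str]],
                       n_pos: int = 5) -> List[float]:
    """Fraction of reference C sites read as T at each 5' position.

    Each pair is (reference, read): equal-length, gap-free strings in read
    orientation. This input format is a simplifying assumption.
    """
    c_sites = [0] * n_pos   # reference C observed at position i
    c_to_t = [0] * n_pos    # ...of which the read shows T
    for ref, read in pairs:
        for i in range(min(n_pos, len(ref), len(read))):
            if ref[i].upper() == "C":
                c_sites[i] += 1
                if read[i].upper() == "T":
                    c_to_t[i] += 1
    # Positions with no reference C are reported as NaN.
    return [t / c if c else float("nan") for t, c in zip(c_to_t, c_sites)]
```

A rate that is high at the first position and decays further into the read is the pattern expected from deamination damage; flat, low rates would instead suggest modern DNA.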


Phylogenetic analysis of the mitochondrial genomes 


All nine complete mitochondrial genomes that fulfilled the criteria of authenticity were
assigned to haplogroups using Haplofind. A Maximum Parsimony tree including present
day humans and previously published ancient mtDNA sequences was generated with
MEGA. The effect of branch shortening due to a lower number of substitutions in ancient
lineages was studied by calculating the nucleotide edit distance to the root for all haplogroup
R sequences (SI4).


Sex determination and Y-chromosome analysis 


We assessed the sex of all sequenced individuals by using the ratio of (chrY) to (chrY
+chrX) aligned reads. We downloaded a list of Y-chromosome SNPs curated by the
International Society of Genetic Genealogy (ISOGG, http://www.isogg.org) v. 9.22
(accessed Feb. 18, 2014) and determined the state of the ancient individuals at positions
where a single allele was observed and MAPQ ≥30. We excluded C/G or A/T SNPs due to
uncertainty about the polarity of the mutation in the database. The ancient individuals were
assigned haplogroups based on their derived state (SI5). We also used BEAST v1.7.5 to
assess the phylogenetic position of Loschbour using 623 males from around the world with
2,799 variant sites across 500kb of non-recombining Y-chromosome sequence (SI5).
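The read-ratio test for sex can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the two cut-offs `female_max` and `male_min` are hypothetical placeholders, and real thresholds would need to be taken from the cited method.

```python
def y_ratio(chr_y_reads: int, chr_x_reads: int) -> float:
    """Fraction of sex-chromosome reads that align to chrY: Y / (Y + X)."""
    total = chr_y_reads + chr_x_reads
    if total == 0:
        raise ValueError("no reads aligned to chrX or chrY")
    return chr_y_reads / total


def call_sex(chr_y_reads: int, chr_x_reads: int,
             female_max: float = 0.02, male_min: float = 0.07) -> str:
    """Classify by the Y ratio; both cut-offs are illustrative assumptions."""
    ratio = y_ratio(chr_y_reads, chr_x_reads)
    if ratio <= female_max:
        return "female"
    if ratio >= male_min:
        return "male"
    return "undetermined"
```

Leaving a gap between the two cut-offs keeps low-coverage or contaminated samples from being forced into either class.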


Estimation of Neanderthal admixture 


We estimate Neanderthal admixture in ancient individuals with the f4-ratio or S-statistic
α = f4(Altai, Denisova; Test, Yoruba)/f4(Altai, Denisova; Vindija, Yoruba),
which uses whole genome data from Altai, a high coverage (52x) Neanderthal genome
sequence; Denisova, a high coverage sequence from another archaic human population
(31x); and Vindija, a low coverage (1.3x) Neanderthal genome from a mixture of three
Neanderthal individuals from Vindija Cave in Croatia.
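One way to see why this ratio estimates a Neanderthal ancestry proportion is sketched below. The derivation rests on assumptions not spelled out in the text: that Test is a mixture, in proportions α and 1 − α, of a Neanderthal lineage N that forms a clade with Vindija relative to Altai and Denisova, and a modern human lineage M that forms a clade with Yoruba relative to those two genomes. The labels N and M are introduced here for illustration only, and all equalities hold in expectation.

```latex
% Expected allele frequency of Test under the assumed mixture model
p_{\mathrm{Test}} = \alpha\, p_{N} + (1-\alpha)\, p_{M}

% f4 is linear in its third argument, so
f_4(\mathrm{Altai},\mathrm{Denisova};\mathrm{Test},\mathrm{Yoruba})
  = \alpha\, f_4(\mathrm{Altai},\mathrm{Denisova};N,\mathrm{Yoruba})
  + (1-\alpha)\, f_4(\mathrm{Altai},\mathrm{Denisova};M,\mathrm{Yoruba})

% M and Yoruba form a clade relative to (Altai, Denisova): the second term is 0.
% N and Vindija form a clade relative to (Altai, Denisova):
%   f4(Altai, Denisova; N, Vindija) = 0, so N can be replaced by Vindija.
f_4(\mathrm{Altai},\mathrm{Denisova};\mathrm{Test},\mathrm{Yoruba})
  = \alpha\, f_4(\mathrm{Altai},\mathrm{Denisova};\mathrm{Vindija},\mathrm{Yoruba})

% Hence
\alpha = \frac{f_4(\mathrm{Altai},\mathrm{Denisova};\mathrm{Test},\mathrm{Yoruba})}
              {f_4(\mathrm{Altai},\mathrm{Denisova};\mathrm{Vindija},\mathrm{Yoruba})}
```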


Inference of demographic history and inbreeding 


We used the Pairwise Sequentially Markovian Coalescent (PSMC) to infer the size of the
ancestral population of Stuttgart and Loschbour. This analysis requires high quality diploid
genotype calls and cannot be performed on the low-coverage Motala samples. To determine
whether the low effective population size inferred for Loschbour is due to recent inbreeding, 
we plotted the time-to-most-recent common ancestor (TMRCA) along each of chr1-22 to 
detect runs of low TMRCA. 


Analysis of segmental duplications and copy number variants 


We built read-depth based copy number maps for the Loschbour, Stuttgart and Motala12
genomes in addition to the Denisova and Altai Neanderthal genomes and 25 deeply
sequenced modern genomes (SI7). We built these maps by aligning reads, subdivided into
their non-overlapping 36-bp constituents, against the reference genome using the mrsFAST
aligner, and renormalizing read-depth for local GC content. We estimated copy numbers
in windows of 500 unmasked base pairs slid at 100 bp intervals across the genome. We
called copy number variants using a scale space filter algorithm. We genotyped variants of
interest and compared the genotypes to those from individuals sequenced as part of the 1000
Genomes Project.
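The sliding-window step can be sketched as below. This is a simplified illustration under stated assumptions: depth is given per base, the 100 bp step is applied in unmasked coordinates, GC renormalization is assumed to have been done already, and `diploid_depth` (the mean depth of a two-copy region) is a hypothetical parameter supplied by the caller.

```python
from statistics import mean
from typing import List, Sequence


def window_copy_numbers(depth: Sequence[float], masked: Sequence[bool],
                        window: int = 500, step: int = 100,
                        diploid_depth: float = 30.0) -> List[float]:
    """Estimate copy number in sliding windows over unmasked positions.

    Copy number is taken as 2 * (window mean depth / diploid_depth).
    """
    if len(depth) != len(masked):
        raise ValueError("depth and masked must have the same length")
    # Drop masked bases so every window holds `window` unmasked positions.
    unmasked = [d for d, m in zip(depth, masked) if not m]
    estimates = []
    for start in range(0, len(unmasked) - window + 1, step):
        window_mean = mean(unmasked[start:start + window])
        estimates.append(2.0 * window_mean / diploid_depth)
    return estimates
```

With 30x depth standing for two copies, a window averaging 60x would be reported as four copies.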


Phenotypic inference 


We inferred likely phenotypes (SI8) by analyzing DNA polymorphism data in the VCF
format using VCFtools (http://vcftools.sourceforge.net/). For the Loschbour and Stuttgart
individuals, we included data from sites not flagged as LowQuality, with genotype quality
(GQ) of ≥30, and SNP quality (QUAL) of ≥50. For Motala12, which is of lower coverage,
we included sites having at least 2x coverage and that passed visual inspection of the local
alignment using samtools tview (http://samtools.sourceforge.net).
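The site filters for the high-coverage genomes can be illustrated with a small standard-library sketch. It does not use or reproduce the VCFtools interface; the function name `passes_filters` is hypothetical, and the code assumes a single-sample, tab-delimited VCF record whose FORMAT column includes a GQ field.

```python
def passes_filters(vcf_line: str, min_gq: float = 30.0,
                   min_qual: float = 50.0,
                   bad_flag: str = "LowQuality") -> bool:
    """Return True if a single-sample VCF record meets the thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
    qual, flt, fmt, sample = fields[5], fields[6], fields[8], fields[9]
    if qual == "." or float(qual) < min_qual:
        return False
    if bad_flag in flt.split(";"):
        return False
    # Pair FORMAT keys with the sample's values to look up GQ.
    values = dict(zip(fmt.split(":"), sample.split(":")))
    gq = values.get("GQ", ".")
    return gq != "." and float(gq) >= min_gq
```

Records with a missing QUAL or GQ are rejected rather than passed, which errs on the conservative side.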


Human Origins dataset curation 


The Human Origins array consists of 14 panels of SNPs for which the ascertainment is well
known. All population genetics analyses were carried out on a set of 594,924 autosomal
SNPs, after restricting to sites that had >90% completeness across 7 different batches of
sequencing, and that had >97.5% concordance with at least one of two subsets of samples
for which whole genome sequencing data was also available. The total dataset consists of
2,722 individuals, which we filtered to 2,345 individuals (203 populations) after removing
outlier individuals or relatives based on visual inspection of PCA plots (refs 14, 62) or
model-based clustering analysis. Whole genome amplified (WGA) individuals were not used in
analysis, except for a Saami individual who we included because of the special interest of
this population for Northeastern European population history (Extended Data Fig. 7).


ADMIXTURE analysis


We merged all Human Origins genotype data with whole genome sequencing data from
Loschbour, Stuttgart, MA1, Motala12, Motala merge, and LaBrana. We then thinned the
resulting dataset to remove SNPs in linkage-disequilibrium with PLINK 1.07, using a
window size of 200 SNPs advanced by 25 SNPs and an r² threshold of 0.4. We ran
ADMIXTURE 1.23 for 100 replicates with different starting random seeds, default
5-fold cross-validation, and varying the number of ancestral populations K between 2 and 20.
We assessed clustering quality using CLUMPP. We used the ADMIXTURE results to
identify a set of 59 “West Eurasian” (European/Near Eastern) populations based on values
of a “West Eurasian” ancestral population at K=3 (SI9). We also identified 15 populations
for use as “non-West Eurasian outgroups” based on their having at least 10 individuals and
no evidence of European or Near Eastern admixture at K=11, the lowest K for which Near
Eastern/European-maximized ancestral populations appeared consistently across all 100
replicates.


Principal Components Analysis


We used smartpca (version: 10210) from EIGENSOFT 5.0.1 to carry out Principal
Components Analysis (PCA) (SI10). We performed PCA on a subset of individuals and
then projected others using the lsqproject: YES option that gives an unbiased inference of
the position of samples even in the presence of missing data (especially important for
ancient DNA).


f3 statistics


We use the f3-statistic

$$f_3(\mathrm{Test}; \mathrm{Ref}_1, \mathrm{Ref}_2) = \frac{1}{N} \sum_{i=1}^{N} (t_i - r_{1,i})(t_i - r_{2,i}),$$

where $t_i$, $r_{1,i}$ and $r_{2,i}$ are the allele frequencies for the i-th SNP in populations
Test, Ref1, Ref2, respectively, to determine if there is evidence that the Test population is
derived from admixture of populations related to Ref1 and Ref2 (SI11). A significantly
negative statistic provides unambiguous evidence of mixture in the Test population. We
allow Ref1 and Ref2 to be any Human Origins population with 4 or more individuals, or
Loschbour, Stuttgart, MA1, Motala12, LaBrana. We assess significance of the f3-statistics
using a block jackknife and a block size of 5cM. We report significance as the number of
standard errors by which the statistic differs from zero (Z-score). We also perform an
analysis in which we constrain the reference populations to be (i) EEF (Stuttgart) and WHG
(Loschbour or LaBrana), (ii) EEF and a Near Eastern population, (iii) EEF and ANE (MA1),
or (iv) any two present-day populations, and compute a Zdiff score between the lowest
f3-statistic observed in the dataset, and the f3-statistic observed for the specified pair.


f4 statistics


We analyze f4-statistics of the form

$$f_4(A, B; C, D) = \frac{1}{N} \sum_{i=1}^{N} (a_i - b_i)(c_i - d_i),$$

where $a_i$, $b_i$, $c_i$ and $d_i$ are the allele frequencies for the i-th SNP in populations
A, B, C, D, to assess if populations A, B are consistent with forming a clade in an unrooted
tree with respect to C, D. If they form a clade, the allele frequency differences between the
two pairs should be uncorrelated and the statistic has an expected value of 0. We set the
outgroup D to be a sub-Saharan African population or Chimpanzee. We systematically tried
all possible combinations of the ancient samples or 15 “non-West Eurasian outgroups”
identified by ADMIXTURE analysis as A, B, C to determine their genetic affinities (SI14).
Setting A as a present-day test population and B as either Stuttgart or BedouinB, we
documented relatedness to C=(Loschbour or MA1) or C=(MA1 and Karitiana) or C=(MA1 or Han)
(Extended Data Figs. 4, 5, 7). Setting C as a test population and (A, B) a pair from
(Loschbour, Stuttgart, MA1) we documented differential relatedness to ancient populations
(Extended Data Fig. 6). We computed D-statistics using transversion polymorphisms in
whole genome sequence data to confirm robustness to ascertainment and ancient DNA
damage (Extended Data Table 2).
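The f3 and f4 definitions and the block-jackknife Z-score can be sketched in a few lines. This is an illustration under simplifying assumptions, not the software used in the study: it takes population allele frequencies as given, omits any finite-sample corrections applied in practice, weights jackknife blocks only by their SNP counts, and expects `block_ids` (for example, one label per 5 cM block) to be precomputed by the caller. All function names are hypothetical.

```python
from math import sqrt
from typing import Dict, Hashable, List, Sequence, Tuple


def f3_terms(t: Sequence[float], r1: Sequence[float],
             r2: Sequence[float]) -> List[float]:
    """Per-SNP terms (t_i - r1_i) * (t_i - r2_i)."""
    return [(a - b) * (a - c) for a, b, c in zip(t, r1, r2)]


def f4_terms(a: Sequence[float], b: Sequence[float],
             c: Sequence[float], d: Sequence[float]) -> List[float]:
    """Per-SNP terms (a_i - b_i) * (c_i - d_i)."""
    return [(w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)]


def f3(t, r1, r2) -> float:
    terms = f3_terms(t, r1, r2)
    return sum(terms) / len(terms)


def f4(a, b, c, d) -> float:
    terms = f4_terms(a, b, c, d)
    return sum(terms) / len(terms)


def jackknife_z(per_snp: Sequence[float],
                block_ids: Sequence[Hashable]) -> float:
    """Z-score of the mean of per-SNP terms via a delete-one-block jackknife."""
    blocks: Dict[Hashable, Tuple[float, int]] = {}
    for value, block in zip(per_snp, block_ids):
        total, count = blocks.get(block, (0.0, 0))
        blocks[block] = (total + value, count + 1)
    g = len(blocks)
    if g < 2:
        raise ValueError("need at least two blocks")
    grand_total = sum(total for total, _ in blocks.values())
    grand_count = sum(count for _, count in blocks.values())
    estimate = grand_total / grand_count
    # Recompute the mean with each block removed in turn.
    leave_one_out = [(grand_total - total) / (grand_count - count)
                     for total, count in blocks.values()]
    centre = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((x - centre) ** 2 for x in leave_one_out)
    return estimate / sqrt(variance)
```

For example, `jackknife_z(f3_terms(t, r1, r2), block_ids)` gives the number of standard errors by which an f3 estimate differs from zero; a strongly negative value is the signal of admixture described above.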


Minimum number of source populations for Europeans 


We used qpWave to study the minimum number of source populations for a designated
set of Europeans (SI12). We use f4-statistics of the form X(l, r) = f4(l0, l; r0, r) where l0, r0
are arbitrarily chosen “base” populations, and l, r are other populations from two sets L and
R respectively. If X(l, r) has rank r and there were n waves of immigration into R with no
back-migration from R to L, then r + 1 ≤ n. We set L to include Stuttgart, Loschbour, MA1,
Onge, Karitiana, Mbuti and R to include 23 modern European populations who fit the model
of SI14 and had admixture proportions within the interval [0,1] for the method with minimal
modeling assumptions (SI17).
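The link between the rank of the matrix and the number of waves can be sketched as follows. This is an informal argument under assumptions introduced here for illustration: each population r in R is a mixture of n source populations S_1, ..., S_n with proportions α_{r,k} summing to 1, and any drift in R after mixture is independent of the populations in L. The symbols S_k, α_{r,k}, A and B do not appear in the original text, and the matrix rank is written "rank" to avoid clashing with the population index r.

```latex
% Expected allele frequency of a population r in R
p_r = \sum_{k=1}^{n} \alpha_{r,k}\, p_{S_k}, \qquad \sum_{k=1}^{n} \alpha_{r,k} = 1

% Substituting into X(l, r) = f4(l0, l; r0, r) and using linearity of f4
X(l, r) = \sum_{k=1}^{n} (\alpha_{r_0,k} - \alpha_{r,k})\,
          \mathrm{E}\big[(p_{l_0} - p_l)\, p_{S_k}\big]

% The coefficients sum to zero over k, so S_n can be used as a reference:
X(l, r) = \sum_{k=1}^{n-1} (\alpha_{r_0,k} - \alpha_{r,k})\, f_4(l_0, l; S_k, S_n)

% X is therefore a product of two matrices with inner dimension n - 1:
X = A\,B, \quad A_{l,k} = f_4(l_0, l; S_k, S_n), \quad
B_{k,r} = \alpha_{r_0,k} - \alpha_{r,k}

% Hence
\operatorname{rank}(X) \le n - 1 \quad\Longleftrightarrow\quad \operatorname{rank}(X) + 1 \le n
```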


Admixture proportions for Stuttgart in the absence of a Near Eastern ancient genome 


We used Loschbour and BedouinB as surrogates for “Unknown hunter-gatherer” and Near
Eastern (NE) farmer populations that contributed to Stuttgart (SI13). Ancient Near Eastern
ancestry in Stuttgart is estimated by the f4-ratio f4(Outgroup, X; Loschbour,
Stuttgart)/f4(Outgroup, X; Loschbour, NE). A complication is that BedouinB is a mixture of
NE and African ancestry. We therefore subtracted the effects of African ancestry using
estimates of the BedouinB African admixture proportion from ADMIXTURE (SI9) or
ALDER.


Admixture graph modeling 


We used ADMIXTUREGRAPHS (version 3110) to model population relationships between 
Loschbour, Stuttgart, Onge, and Karitiana using Mbuti as an African outgroup. We assessed 
model fit using a block jackknife of differences between estimated and fitted f-statistics for 
the set of included populations (we expressed the fit as a Z score). We determined that a 
model failed if |Z|>3 for at least one f-statistic. A basic tree model failed and we manually 
amended the model to test all possible models with a single admixture event, which also 
failed. Further manual amendment to include 2 admixture events resulted in 8 successful 
models, only one of which could be amended to also fit MA1 as an additional constraint. We 
successfully fit both the Iceman and LaBrana into this model as simple clades and Motala12 
as a 2-way mixture. We also fit present-day West Eurasians as clades, 2-way mixtures, or 3- 
way mixtures in this basic model, achieving a successful fit for a larger number of European 
populations (n=26) as 3-way mixtures. We estimated the individual admixture proportions 
from the fitted model parameters. To test if fitted parameters for different populations are 
consistent with each other, we jointly fit all pairs of populations A and B by modifying 
ADMIXTUREGRAPH to add a large constant (10,000) to the variance term f3(A0, A, B). By 
doing this, we can safely ignore recent gene flow within Europe that affects statistics that 
include both A and B. 
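
The failure criterion described above can be written out in a few lines. This is only a sketch of the |Z| > 3 rule, not ADMIXTUREGRAPH itself; the function name and the toy numbers are hypothetical, and the standard errors are assumed to come from a block jackknife computed elsewhere.

```python
import numpy as np

def model_fails(estimated, fitted, std_err, threshold=3.0):
    """Return (fails, z): fails is True if any |Z| exceeds the threshold."""
    estimated = np.asarray(estimated, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    std_err = np.asarray(std_err, dtype=float)
    z = (estimated - fitted) / std_err
    return bool(np.any(np.abs(z) > threshold)), z

# Hypothetical f-statistics: the second lies about 4 standard errors
# from its fitted value, so the model is rejected.
fails_bad, z_bad = model_fails([0.010, 0.020], [0.011, 0.012], [0.001, 0.002])

# Hypothetical f-statistics that all lie within 3 standard errors.
fails_ok, z_ok = model_fails([0.010, 0.020], [0.011, 0.018], [0.001, 0.002])
```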


Ancestry estimates from f4-ratios 


We estimate EEF ancestry using the f4-ratio f4(Mbuti, Onge; Loschbour, European)/ 
f4(Mbuti, Onge; Loschbour, Stuttgart), which produces consistent results with 
ADMIXTUREGRAPH (SI14). We use f4(Stuttgart, Loschbour; Onge, MA1)/f4(Mbuti, MA1; 
Onge, Loschbour) to estimate Basal Eurasian admixture into Stuttgart. We use f4(Stuttgart, 
Loschbour; Onge, Karitiana)/f4(Stuttgart, Loschbour; Onge, MA1) to estimate ANE mixture 
in Karitiana (Fig. 2B). We use f4(Test, Stuttgart; Karitiana, Onge)/f4(MA1, Stuttgart; 
Karitiana, Onge) to place a lower bound on ANE mixture into North Caucasian populations. 
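
To make the f4-ratio concrete, the sketch below computes f4(Mbuti, Onge; Loschbour, European)/f4(Mbuti, Onge; Loschbour, Stuttgart) from allele frequencies. The frequencies are random toy values, not real data, and the "European" population is constructed as an exact 60%/40% linear mixture of the Stuttgart and Loschbour frequencies, an idealization that ignores drift and sampling noise. The function names are assumptions for illustration.

```python
import numpy as np

def f4(p, a, b, c, d):
    """f4(a, b; c, d): mean over SNPs of (p_a - p_b) * (p_c - p_d)."""
    return float(np.mean((p[a] - p[b]) * (p[c] - p[d])))

def f4_ratio(p, numerator, denominator):
    """Ratio of two f4-statistics, each given as a 4-tuple of population names."""
    return f4(p, *numerator) / f4(p, *denominator)

# Hypothetical allele frequencies at 10,000 SNPs.
rng = np.random.default_rng(1)
n_snps = 10_000
p = {name: rng.uniform(0.05, 0.95, n_snps)
     for name in ["Mbuti", "Onge", "Stuttgart"]}
# The toy Loschbour shares variation with Onge so that the denominator
# of the ratio is clearly non-zero.
p["Loschbour"] = 0.5 * p["Onge"] + 0.5 * rng.uniform(0.05, 0.95, n_snps)

# A toy European that is exactly 60% Stuttgart-like, 40% Loschbour-like.
true_eef = 0.6
p["European"] = true_eef * p["Stuttgart"] + (1 - true_eef) * p["Loschbour"]

eef_estimate = f4_ratio(
    p,
    ("Mbuti", "Onge", "Loschbour", "European"),
    ("Mbuti", "Onge", "Loschbour", "Stuttgart"),
)
```

In this idealized setting the Loschbour minus European frequency difference is exactly 0.6 times the Loschbour minus Stuttgart difference, so the ratio recovers the mixing proportion.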


MixMapper analysis 


We carried out an analysis with MixMapper 2.0, a semi-supervised admixture graph fitting 
technique. First, we inferred a scaffold tree of populations without strong evidence of mixture 
relative to each other (Mbuti, Onge, Loschbour and MA1). We did not include European 
populations in the scaffold as all had significantly negative f3-statistics indicating admixture. 
We then ran MixMapper to infer the relatedness of the other ancient and present-day 
samples, fitting them onto the scaffold as 2- or 3-way mixtures. The uncertainty in all 
parameter estimates is measured by block bootstrap resampling of the SNP set (100 
replicates with 50 blocks). 
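
The resampling scheme mentioned above (100 replicates with 50 blocks) can be sketched as follows. This is a generic block bootstrap under the assumption that SNPs are split into 50 contiguous blocks of near-equal size; how MixMapper defines its blocks is not stated here, and the per-SNP values and function name are hypothetical.

```python
import numpy as np

def block_bootstrap(values, statistic, n_blocks=50, n_replicates=100, seed=0):
    """Resample contiguous SNP blocks with replacement.

    Returns one value of `statistic` per bootstrap replicate.
    """
    values = np.asarray(values, dtype=float)
    blocks = np.array_split(np.arange(values.size), n_blocks)
    rng = np.random.default_rng(seed)
    replicates = np.empty(n_replicates)
    for i in range(n_replicates):
        # Draw n_blocks block indices with replacement, then pool their SNPs.
        picks = rng.integers(0, n_blocks, size=n_blocks)
        idx = np.concatenate([blocks[j] for j in picks])
        replicates[i] = statistic(values[idx])
    return replicates

# Hypothetical per-SNP contributions to some statistic.
toy_rng = np.random.default_rng(42)
per_snp = toy_rng.normal(loc=0.01, scale=0.05, size=5000)

reps = block_bootstrap(per_snp, np.mean)
std_error = reps.std(ddof=1)

# Sanity check: a constant input gives identical replicates.
constant_reps = block_bootstrap(np.ones(1000), np.mean)
```

Resampling whole blocks rather than single SNPs is what lets the spread of the replicates reflect correlation between neighboring sites.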


TreeMix analysis 


We applied TreeMix to Loschbour, Stuttgart, Motala12, MA1, LaBrana and the 
Iceman, along with the present-day samples of Karitiana, Onge and Mbuti. We restricted 
the analysis to 265,521 Human Origins array sites after excluding any SNPs where there 
were no-calls in any of the studied individuals. The tree was rooted with Mbuti and standard 
errors were estimated using blocks of 500 SNPs. We repeated the analysis on whole-genome 
sequence data, rooting with Chimp and replacing Onge with Dai since we did not have Onge 
whole genome sequence data. We varied the number of migration events (m) between 0 
and 5. 


Inferring admixture proportions with minimal modeling assumptions 


We devised a method to infer ancestry proportions from three ancestral populations (EEF, 
WHG, and ANE) without strong phylogenetic assumptions (SI17). We rely on 15 “non-West 
Eurasian” outgroups and study f4(European, Stuttgart; O1, O2), which equals 
αβ f4(Loschbour, Stuttgart; O1, O2) + α(1−β) f4(MA1, Stuttgart; O1, O2) if European has 1−α 
ancestry from EEF and fractions β and 1−β of its remaining ancestry from WHG and ANE 
respectively. This defines a system of (15 choose 2) = 105 equations with unknowns αβ and 
α(1−β), which we solve with least squares implemented in the function lsfit in R to obtain 
estimates of α and β. We repeated this computation 22 times dropping one chromosome at a 
time to obtain block jackknife estimates of the ancestry proportions and standard errors, with 
block size equal to the number of SNPs per chromosome. We assessed consistency of the 
inferred admixture proportions with those derived from the ADMIXTUREGRAPH model 
based on the number of standard errors between the two (Extended Data Table 1). 
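
The least-squares step above can be sketched in Python with NumPy in place of R. The f4 values below are random stand-ins rather than real statistics, and the function name is hypothetical. One assumption should be flagged: R's lsfit includes an intercept by default, and whether the intercept was retained is not stated above, so this sketch follows the equation as written and fits no intercept.

```python
import itertools
import numpy as np

def estimate_alpha_beta(y, x_whg, x_ane):
    """Solve y = c1 * x_whg + c2 * x_ane by least squares.

    Here c1 = alpha * beta and c2 = alpha * (1 - beta), so that
    alpha = c1 + c2 and beta = c1 / (c1 + c2).
    """
    design = np.column_stack([x_whg, x_ane])
    (c1, c2), *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    alpha = c1 + c2
    beta = c1 / alpha
    return float(alpha), float(beta)

# One equation per unordered pair of the 15 outgroups.
outgroup_pairs = list(itertools.combinations(range(15), 2))
n_equations = len(outgroup_pairs)

# Random stand-ins for f4(Loschbour, Stuttgart; Oi, Oj) and
# f4(MA1, Stuttgart; Oi, Oj) across the outgroup pairs.
rng = np.random.default_rng(7)
x_whg = rng.normal(0.0, 0.01, n_equations)
x_ane = rng.normal(0.0, 0.01, n_equations)

# Noise-free toy values of f4(European, Stuttgart; Oi, Oj).
true_alpha, true_beta = 0.4, 0.75
y = true_alpha * true_beta * x_whg + true_alpha * (1 - true_beta) * x_ane

alpha_hat, beta_hat = estimate_alpha_beta(y, x_whg, x_ane)
```

Under this parameterization the EEF proportion is 1−α, the WHG proportion is αβ and the ANE proportion is α(1−β). Standard errors would come from repeating the fit with each chromosome dropped in turn, which is not shown here.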


Haplotype-based analyses 


We used RefinedIBD from BEAGLE 4 with the settings ibdtrim=20 and ibdwindow=25 to 
study IBD sharing between Loschbour and Stuttgart and populations from the POPRES 
dataset. We kept all IBD tracts spanning at least 0.5 centimorgans (cM) and with a LOD 
score >3 (SI18). We also used ChromoPainter to study haplotype sharing between 
Loschbour and Stuttgart and present-day West Eurasian populations (SI19). We identified 
495,357 SNPs that were complete in all individuals and phased the data using Beagle 4 
with parameters phase-its=50 and impute-its=10. We did not keep sites with missing data to 
avoid imputing modern alleles into the ancient individuals. We used both unlinked (-k 1000) 
and linked modes (estimating -n and -M by sampling 10% of individuals). We combined 
ChromoPainter output for chromosomes 1-22 using ChromoCombine. We carried out a 
PCA of the co-ancestry matrix using fineSTRUCTURE. 
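
The tract filter described above (length of at least 0.5 cM and LOD score above 3) amounts to a simple predicate. The record layout below is a hypothetical stand-in, not the actual RefinedIBD output format, and the sample names and values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class IbdTract:
    """Hypothetical record for one inferred IBD tract."""
    sample_a: str
    sample_b: str
    length_cm: float
    lod: float

def keep_tract(tract, min_cm=0.5, min_lod=3.0):
    """Keep tracts spanning at least min_cm cM with a LOD score above min_lod."""
    return tract.length_cm >= min_cm and tract.lod > min_lod

tracts = [
    IbdTract("Loschbour", "sample1", 0.8, 4.2),  # kept
    IbdTract("Loschbour", "sample2", 0.3, 5.0),  # dropped: shorter than 0.5 cM
    IbdTract("Stuttgart", "sample3", 1.1, 3.0),  # dropped: LOD not above 3
    IbdTract("Stuttgart", "sample4", 0.5, 3.1),  # kept: exactly 0.5 cM is allowed
]
kept = [t for t in tracts if keep_tract(t)]
```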


Extended Data 


Extended Data Figure 1. 
Photographs of analyzed ancient samples. 


(A) Loschbour skull; (B) Stuttgart skull, missing the lower right M2 we sampled; (C) 
excavation at Kanaljorden in Motala, Sweden; (D) Motala 1 in situ. 


Extended Data Figure 2. 
Pairwise Sequential Markovian Coalescent (PSMC) analysis. 


(A) Inference of population size as a function of time, showing a very small 
population size over the most recent period in the ancestry of Loschbour (at least the last 
5-10 thousand years). (B) Inferred time since the most recent common ancestor from the 
PSMC for chromosomes 20, 21, 22 (top to bottom); Stuttgart is plotted on top and 
Loschbour at bottom. 


Extended Data Figure 3. 
ADMIXTURE analysis (K=2 to K=20). 


Ancient samples (Loschbour, Stuttgart, Motala_merge, Motala12, MA1, and LaBrana) are at 
left. 


Extended Data Figure 4. 
ANE ancestry is present in both Europe and the Near East but WHG ancestry is restricted to 
Europe, which cannot be due to a single admixture event. 


(x-axis) We computed the statistic f4(Test, Stuttgart; MA1, Chimp), which measures whether 
MA1 shares more alleles with a test population than with Stuttgart. It is positive for most 
European and Near Eastern populations, consistent with ANE (MA1-related) gene flow into 
both regions. (y-axis) We computed the statistic f4(Test, Stuttgart; Loschbour, Chimp), 
which measures whether Loschbour shares more alleles with a test sample than with 
Stuttgart. Only European populations show positive values of this statistic, providing 
evidence of WHG (Loschbour-related) admixture only in Europeans. 


Extended Data Figure 5. 
MA1 is the best surrogate for ANE for which we have data. 


Europeans share more alleles with MA1 than with Karitiana, as we see from the fact that in 
a plot of f4(Test, BedouinB; MA1, Chimp) and f4(Test, BedouinB; Karitiana, Chimp), the 
European cline deviates in the direction of MA1, rather than Karitiana (the slope is >1 and 
European populations are above the line indicating equality of these two statistics). 


Extended Data Figure 6. 
The differential relatedness of West Eurasians to Stuttgart (EEF), Loschbour (WHG), and 
MA1 (ANE) cannot be explained by two-way mixture. 


We plot on a West Eurasian map the statistic f4(Test, Chimp; A1, A2), where A1 and A2 are a 
pair of the three ancient samples representing the three ancestral populations of Europe. (A) 
In both Europe and the Near East/Caucasus, populations from the south have more 
relatedness to Stuttgart than those from the north where ANE influence is also important. 
(B) Northern European populations share more alleles with Loschbour than with Stuttgart, 
as they have additional WHG ancestry beyond what was already present in EEF. (C) We 
observe a striking contrast between Europe west of the Caucasus and the Near East in degree 
of relatedness to WHG. In Europe, there is a much higher degree of allele sharing with 
Loschbour than with MA1, which we ascribe to the 60-80% WHG/(WHG+ANE) ratio in 
most Europeans that we report in SI14. In contrast, the Near East has no appreciable WHG 
ancestry but some ANE ancestry, especially in the northern Caucasus. (Jewish populations 
are marked with a square in this figure to assist in interpretation as their ancestry is often 
anomalous for their geographic regions.) 


Extended Data Figure 7. 
Evidence for Siberian gene flow into far northeastern Europe. 


Some northeastern European populations (Chuvash, Finnish, Russian, Mordovian, Saami) 
share more alleles with Han Chinese than with other Europeans who are arrayed in a cline 
from Stuttgart to Lithuanians/Estonians in a plot of f4(Test, BedouinB; Han, Mbuti) against 
f4(Test, BedouinB; MA1, Mbuti). 


Extended Data Table 1 


West Eurasians genotyped on the Human Origins array and key f-statistics. 
Note: Zdiff is the number of standard errors of the difference between the lowest f3-statistic over all reference pairs and the 
lowest f3-statistic for a subset of reference pairs. 
Abbreviations used: Stu: Stuttgart; Los: Loschbour; LaB: LaBrana. 


Extended Data Table 2 


Confirmation of key findings on transversions and on whole genome sequence data. 


Interpretation | D(A, B; C, D) on Human Origins genotype data: A, B, C, D | statistic (594,924 SNPs) | Z | statistic (110,817 transversions) | Z | D(A, B; C, D) on whole genome sequence data transversions: A, B, C, D | statistic | Z
Stuttgart has Near Eastern ancestry | Stuttgart, Armenian, Loschbour, Chimp | 0.0219 | 4.5 | 0.0189 | 2.9 | | | 
Europeans have more WHG-related ancestry than Stuttgart. | Stuttgart, French, Loschbour, Chimp | -0.0266 | -5.7 | -0.031 | -5.0 | Stuttgart, French2, Loschbour, Chimp | -0.03 | -4.7
 | Lithuanian, Stuttgart, Loschbour, Chimp | 0.0446 | 9.1 | 0.0477 | 7.2 | | | 
West Eurasians have more ANE-related ancestry than Stuttgart. | French, Stuttgart, MA1, Chimp | 0.0367 | 9.7 | 0.0386 | 5.5 | French2, Stuttgart, MA1, Chimp | 0.037 | 6.4
 | Lezgin, Stuttgart, MA1, Chimp | 0.0372 | 7.6 | 0.0409 | 5.6 | | | 
MA1 is a better surrogate of ANE ancestry than Karitiana | French, Chimp, MA1, Karitiana | 0.0207 | 4.5 | 0.0214 | 2.8 | French2, Chimp, MA1, Karitiana2 | 0.026 | 3.8


minimal assumptions are from SI17. The estimates from the full modeling are from SI14, 
either by single population analysis or by co-fitting population pairs and averaging over fits 
(these averages are the results plotted in Fig. 2B). Populations that do not fit the models are 
not reported. 


Full modeling of population relationships (individual fits) 

Population | EEF | WHG | ANE
Albanian | 0.781 | 0.092 | 0.127
Ashkenazi Jew | 0.931 | 0 | 0.069
Basque | 0.593 | 0.293 | 0.114
Belarusian | 0.418 | 0.431 | 0.151
Bergamo | 0.715 | 0.177 | 0.108


Full modeling of population relationships (averaged fits) 

Population | EEF mean | EEF range | WHG mean | WHG range | ANE mean | ANE range
Albanian | 0.781 | 0.772-0.819 | 0.082 | 0.032-0.098 | 0.137 | 0.129-0.158
Basque | 0.569 | 0.527-0.616 | 0.335 | 0.255-0.392 | 0.096 | 0.076-0.129
Belarusian | 0.426 | 0.397-0.464 | 0.408 | 0.338-0.443 | 0.167 | 0.150-0.199
Bergamo | 0.721 | 0.704-0.793 | 0.163 | 0.061-0.189 | 0.117 | 0.104-0.147


Modeling of population relationships with minimal assumptions 

Population | EEF | WHG | ANE
Albanian | 0.595 ± 0.112 | 0.353 ± 0.150 | 0.052 ± 0.049
Ashkenazi Jew | 0.938 ± 0.146 | 0.021 ± 0.185 | 0.083 ± 0.049
Basque | 0.569 ± 0.091 | 0.315 ± 0.124 | 0.115 ± 0.041
Belarusian | 0.272 ± 0.094 | 0.554 ± 0.131 | 0.174 ± 0.047
Bergamo | 0.644 ± 0.125 | 0.248 ± 0.170 | 0.108 ± 0.053


Model-based (averaged) - Model with minimal assumptions (Z-score) 

Population | EEF | WHG | ANE
Albanian | 1.658 | -1.807 | 1.741
Basque | 0.001 | 0.165 | -0.472
Belarusian | 1.637 | -1.118 | -0.158
Bergamo | 0.615 | -0.503 | 0.162


Full modeling of 


population relationships 


EEF 


Bulgarian 0.712 
Croatian 0.561 
Czech 0.495 
English 0.495 
Estonian 0.322 
French 0.554 
French_South 0.675 
Greek 0.792 
Hungarian 0.558 
Icelandic 0.394 
Lithuanian 0.364 
Maltese 0.932 
Norwegian 0.411 
Orcadian 0.457 
Sardinian 0.817 
Scottish 0.39 
Sicilian 0.903 
Spanish 0.809 
Spanish_North 0.713 
Tuscan 0.746 
Ukrainian 0.462 
Finnish 

Mordovian 


Russian 


(individual fits) 


WHG 


0.147 


0.293 


0.338 


0.364 


0.495 


0.311 


0.195 


0.058 
ANE 


0.141 


0.145 


0.167 


0.141 


0.183 


0.135 


0.13 


0.151 


0.179 


0.15 


0.172 


0.068 


0.161 


0.158 


0.008 


0.182 


0.097 


0.123 


0.163 


0.118 


0.151 


0.718 


0.564 


0.489 


0.503 


0.323 


0.563 


0.636 


0.791 


0.548 


0.409 


0.352 


0.417 


0.465 


0.818 


0.408 


0.707-0.778 


0.548-0.586 


0.460-0.531 


0.476-0.536 


0.293-0.345 


0.537-0.601 


0.589-0.738 


0.780-0.816 


0.520-0.590 


0.386-0.424 


0.327-0.384 


0.388-0.438 


0.439-0.493 


0.791-0.874 


0.387-0.424 


0.736-0.804 


0.561-0.660 


0.737-0.806 


0.445-0.491 


Full modeling of 
population relationships 
(averaged fits) 


WHG 
Mean Range 
0.132 0.047-0.151 
0.285 0.242-0.310 
0.348 0.273-0.382 
0.353 0.296-0.382 
0.49 0.451-0.520 
0.397 0.230-0.328 
0.256 0.111-0.323 
0.048 0.019-0.060 
0.279 0.199-0.313 
0.448 0.409-0.473 
0.488 0.433-0.527 
0.383-0.450 
0.329-0.403 
0.058-0.182 
0.384-0.448 
0.066-0.170 
0.214-0.365 
0.047-0.145 
0.322-0.399 


0.151 


0.151 


0.163 


0.144 


0.187 


0.14. 


0.108 


0.161 


0.174 


0.143 


0.16 


0.115 


0.096 


0.126 


0.16 


0.138-0.175 


0.137-0.172 


0.145-0.196 


0.130-0.169 


0.172-0.205 


0.126-0.169 


0.088-0.151 


0.150-0.171 


0.156-0.210 


0.126-0.170 


0.135-0.184 


0.140-0.181 


0.140-0.179 


0.026-0.068 


0.149-0.201 


0.091-0.151 


0.072-0.126 


0.114-0.150 


0.148-0.187 


0.556 ± 0.110 
0.453 ± 0.122 
0.402 ± 0.117 
0.475 ± 0.091 
0.072 ± 0.121 
0.498 ± 0.097 
0.636 ± 0.116 
0.658 ± 0.098 
0.391 ± 0.109 
0.342 ± 0.102 
0.248 ± 0.117 
1.298 ± 0.185 
0.273 ± 0.115 
0.395 ± 0.088 
0.883 ± 0.128 
0.286 ± 0.112 
1.012 ± 0.149 
0.856 ± 0.126 
0.581 ± 0.120 
0.734 ± 0.118 
0.259 ± 0.123 
-0.299 ± 0.204 
0.255 ± 0.173 
0.303 ± 0.211 


Modeling of population 
relationships with 
minimal assumptions 


WHG 


0.328 ± 0.143 
0.407 ± 0.159 
0.400 ± 0.162 
0.357 ± 0.125 
0.778 ± 0.176 
0.359 ± 0.127 
0.225 ± 0.165 
0.255 ± 0.127 
0.454 ± 0.153 
0.476 ± 0.137 
0.548 ± 0.163 
0.509 ± 0.248 
0.557 ± 0.161 
0.437 ± 0.122 
0.075 ± 0.166 
0.532 ± 0.156 
0.131 ± 0.199 
0.015 ± 0.165 
0.298 ± 0.158 
0.153 ± 0.160 
0.596 ± 0.173 
1.194 ± 0.296 
1.151 ± 0.246 
1.230 ± 0.301 


0.116 ± 0.043 
0.140 ± 0.046 
0.198 ± 0.050 
0.168 ± 0.043 
0.150 ± 0.064 
0.142 ± 0.039 
0.140 ± 0.057 
0.086 ± 0.039 
0.155 ± 0.050 
0.182 ± 0.045 
0.205 ± 0.052 
0.211 ± 0.079 
0.170 ± 0.055 
0.168 ± 0.041 
0.042 ± 0.048 
0.182 ± 0.053 
0.119 ± 0.060 
0.160 ± 0.049 
0.121 ± 0.046 
0.113 ± 0.054 
0.145 ± 0.057 
0.105 ± 0.105 
0.104 ± 0.090 
0.072 ± 0.106 
Model-based (averaged) 
- Model with minimal 
assumptions (Z-score) 


EEF 
0.911 


0.744 


0.304 


2.070 


0.672 


0.003 


1.357 


1.437 
0.886 


1.252 
-0.510 
WHG 
-0.028 
-1.627 
-0.487 
-0.038 
-1.269 


ANE 


0.804 


0.238 
0.584 


Figure 1. Map of West Eurasian populations and Principal Component Analysis 
(a) Geographical locations of analyzed samples, with color coding matching the PCA. We 
show all sampling locations for each population, which results in multiple points for some 
(e.g., Spain). (b) PCA on all present-day West Eurasians, with ancient and selected eastern 
non-African samples projected. European hunter-gatherers fall beyond present-day 
Europeans in the direction of European differentiation from the Near East. Stuttgart clusters 
with other Neolithic Europeans and present-day Sardinians. MA1 falls outside the variation 
of present-day West Eurasians in the direction of southern-northern differentiation along 
dimension 2. 


Figure 2. Modeling of West Eurasian population history 
(a) A three-way mixture model that is a fit to the data for many populations. Present-day 
samples are colored in blue, ancient in red, and reconstructed ancestral populations in green. 
Solid lines represent descent without mixture, and dashed lines represent admixture. We 
print mixture proportions and one standard error for the two mixtures relating the highly 
divergent ancestral populations. (We do not print the estimate for the “European” population 
as it varies depending on the population). (b) We plot the proportions of ancestry from each 
of three inferred ancestral populations (EEF, ANE and WHG). 


Table 1 


Lowest f3-statistics for each West Eurasian population 


Ref1 Ref2 Target for which these two references give the lowest f3(X; Ref1, Ref2) 


WHG EEF Sardinian*** 


Basque, Belarusian, Czech, English, Estonian, Finnish, French_South, Icelandic, Lithuanian, Mordovian, 
Norwegian, Orcadian, Scottish, Spanish, Spanish_North, Ukrainian 


WHG Siberian Russian 


WHG Near East 


Abkhasian***, Albanian, Ashkenazi Jew****, Bergamo, Bulgarian, Chechen****, Croatian, Cypriot****, 
Druze**, French, Greek, Hungarian, Lezgin, Maltese, Sicilian, Turkish Jew, Tuscan 


EEF Native American — Adygei, Balkar, Iranian, Kumyk, North Ossetian, Turkish 


EEF African BedouinA, BedouinB†, Jordanian, Lebanese, Libyan Jew, Moroccan Jew, Palestinian, Saudi****, Syrian, 
Tunisian_Jew***, Yemenite_Jew*** 


EEF South Asian Armenian, Georgian****, Georgian Jew*, Iranian Jew***, Iraqi_Jew*** 


Note: WHG = Loschbour or LaBraña; EEF = Stuttgart; ANE = MA1; Native American = Piapoco; African = Esan, Gambian, or Kgalagadi; South 
Asian = GujaratiC or Vishwabrahmin. Statistics are negative with Z<-4 unless otherwise noted: † (positive) or *, **, ***, **** to indicate Z less 
than 0, -1, -2, and -3 respectively. The complete list of statistics can be found in Extended Data Table 1. 